Collective Classification of Posts to Internet Forums
Abstract
We investigate automatic classification of posts to Internet forums. We use collective classification methods, which simultaneously classify related objects — in our case, the posts in a thread. Specifically, we compare the Iterative Classification Algorithm (ICA) with Conditional Random Fields and with conventional classifiers (k-Nearest Neighbours and Support Vector Machines). The ICA algorithm invokes a local classifier, for which we use the kNN classifier. Our main contributions are two-fold. First, we define experimental protocols that we believe are suitable for offline evaluation in this domain. Second, by using these protocols to run experiments on two datasets, we show that ICA with kNN has significantly higher accuracy across most of the experimental conditions.
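As a rough illustration of the idea described in the abstract, the following is a minimal sketch of iterative classification with a kNN local classifier. It is not the authors' implementation. The abstract does not specify the relational features, the distance measure, or the stopping rule, so the sketch assumes that the only relational feature is the label of the preceding post in the thread, that distance is squared Euclidean, and that iteration stops when the labels no longer change. The names `knn_predict`, `relational`, and `ica_classify_thread`, and the toy data, are hypothetical.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training vectors (squared Euclidean)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def relational(prev_label, labels):
    """Assumed relational feature: one-hot label of the previous post.

    The first post in a thread has no predecessor and gets all zeros.
    """
    return [1.0 if prev_label == label else 0.0 for label in labels]


def ica_classify_thread(thread_X, train_threads, labels, k=3, max_iter=10):
    """Classify all posts of one thread together.

    thread_X: list of content feature vectors (lists of floats), in post order.
    train_threads: list of (X, y) pairs, one per labelled training thread.
    """
    content_X, full_X, train_y = [], [], []
    for X, y in train_threads:
        for i, (x, label) in enumerate(zip(X, y)):
            prev = y[i - 1] if i > 0 else None
            content_X.append(x)
            full_X.append(x + relational(prev, labels))
            train_y.append(label)

    # Bootstrap step: content features only, no neighbour labels yet.
    pred = [knn_predict(content_X, train_y, x, k) for x in thread_X]

    # Iterative step: re-classify each post using the current neighbour labels.
    for _ in range(max_iter):
        new_pred = list(pred)
        for i, x in enumerate(thread_X):
            prev = new_pred[i - 1] if i > 0 else None
            new_pred[i] = knn_predict(
                full_X, train_y, x + relational(prev, labels), k
            )
        if new_pred == pred:  # labels are stable, stop
            break
        pred = new_pred
    return pred


# Toy data (invented for illustration): one content feature per post.
LABELS = ["question", "answer"]
train_threads = [
    ([[0.0], [1.0]], ["question", "answer"]),
    ([[0.1], [0.9]], ["question", "answer"]),
    ([[0.5], [1.0]], ["question", "answer"]),
    ([[0.0], [0.5]], ["question", "answer"]),
]
print(ica_classify_thread([[0.05], [0.5]], train_threads, LABELS, k=1))
```

In the toy thread, the second post has an ambiguous content feature (0.5), so a content-only classifier cannot separate it from a first post with the same value. Once the label of the preceding post is included as a feature, the ambiguity disappears, which is the intuition behind classifying the posts of a thread collectively.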
Similar resources
Collective Stance Classification of Posts in Online Debate Forums
Online debate sites are a large source of informal and opinion-sharing dialogue on current socio-political issues. Inferring users’ stance (PRO or CON) towards discussion topics in domains such as politics or news is an important problem, and is of utility to researchers, government organizations, and companies. Predicting users’ stance supports identification of social and political groups, bu...
Recognition of Sentiment Sequences in Online Discussions
Currently 19%-28% of Internet users participate in online health discussions. In this work, we study sentiments expressed on online medical forums. As well as considering the predominant sentiments expressed in individual posts, we analyze sequences of sentiments in online discussions. Individual posts are classified into one of the five categories encouragement, gratitude, confusion, facts, an...
The Psychology of Word Use in Depression Forums
The present studies demonstrate two computerized approaches to examining the expression of depression on the Internet. Study 1 observed linguistic markers of depression in English and Spanish forums. English and Spanish posts by depressed (N=160) and non-depressed individuals (N=160) were collected from Internet forums using bulletin board systems (bbs). A computer program (LIWC2001) was used t...
Classification of Online Health Discussions with Text and Health Feature Sets
Nowadays, many health groups and forums are established on the Internet, where health consumers discuss health issues and interact with each other. Although there is a large amount of user generated content about healthcare on different social media sites, few studies have applied data mining or artificial intelligence techniques for knowledge discovery on a large scale of data in this particul...
The Psychology of Word Use in Depression Forums in English and in Spanish: Testing Two Text Analytic Approaches
The present studies demonstrate two computerized approaches to examining the expression of depression on the Internet. Study 1 observed linguistic markers of depression in English and Spanish forums. English and Spanish posts by depressed (N=160) and non-depressed individuals (N=160) were collected from Internet forums using bulletin board systems (bbs). A computer program (LIWC2001) was used t...